An Ontology-Based Approach for Data Cleaning

Authors

  • Paulo Oliveira
  • Fátima Rodrigues
  • Pedro Rangel Henriques
Abstract

There is no magic solution for data cleaning. The user always has to specify the cleaning operations to perform, and a huge number of operations may have to be specified. Yet this is the condition for detecting and correcting data quality problems successfully. Most cleaning operations are generic enough to be applied to different databases. They may be limited to databases of the same domain, or they can be so general that they are domain independent. The traditional approach to data cleaning is to specify the operations at the database schema level, so several changes are required to reuse a cleaning operation in another database. This paper presents an approach that supports the interoperability of the operations among different databases. This is achieved through an ontological level that supports the conceptual specification of the cleaning operations. This abstraction level isolates the operations from the schema of the databases and allows them to be reused easily.
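The idea of specifying a cleaning operation at a conceptual level and reusing it across schemas can be illustrated with a minimal sketch. It is not the authors' implementation: the `CleaningOperation` class, the `find_violations` helper, the concept name `Person.age`, the column names, and the 0 to 130 age range are all hypothetical choices made for illustration, and a plain dictionary stands in for the ontology-to-schema mapping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class CleaningOperation:
    """Hypothetical concept-level rule: it names an ontology concept, not a column."""
    concept: str
    is_valid: Callable[[Optional[object]], bool]
    description: str


def find_violations(op: CleaningOperation,
                    mapping: Dict[str, str],
                    rows: List[dict]) -> List[int]:
    """Return the indexes of rows whose mapped column fails the operation's check."""
    column = mapping[op.concept]  # resolve the concept to this database's column
    return [i for i, row in enumerate(rows) if not op.is_valid(row.get(column))]


# One operation, specified once at the conceptual level (range is an assumption).
valid_age = CleaningOperation(
    concept="Person.age",
    is_valid=lambda v: isinstance(v, int) and 0 <= v <= 130,
    description="age must be an integer between 0 and 130",
)

# Two databases with different schemas; only the mapping differs.
hr_mapping = {"Person.age": "emp_age"}
clinic_mapping = {"Person.age": "patient_years"}

hr_rows = [{"emp_age": 34}, {"emp_age": -2}]
clinic_rows = [{"patient_years": 200}, {"patient_years": 51}, {"patient_years": None}]

hr_bad = find_violations(valid_age, hr_mapping, hr_rows)
clinic_bad = find_violations(valid_age, clinic_mapping, clinic_rows)
```

In this sketch the rule itself never mentions `emp_age` or `patient_years`; reusing it on another database would only require a new mapping entry, which is the kind of schema isolation the abstract describes.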


Similar resources

An Executive Approach Based On the Production of Fuzzy Ontology Using the Semantic Web Rule Language Method (SWRL)

Today, the need to deal with ambiguous information in semantic web languages is increasing. Ontology is an important part of the W3C standards for the semantic web, used to define a conceptual standard vocabulary for the exchange of data between systems, the provision of reusable databases, and the facilitation of collaboration across multiple systems. However, classical ontology is not enough ...


A New Ontology-Based Approach for Human Activity Recognition from GPS Data

Mobile technologies have deployed a variety of Internet–based services via location based services. The adoption of these services by users has led to mammoth amounts of trajectory data. To use these services effectively, analysis of these kinds of data across different application domains is required in order to identify the activities that users might need to do in different places. Researche...


Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...


Ontology Based Data Validation and Cleaning: Restructuring Operations for Ontology Maintenance

Data validation and cleaning are integral processes of the data quality management cycle. Domain-specific knowledge is needed to detect and correct semantic errors. Ontologies can be used to represent valid and invalid attribute value combinations to detect and correct invalid data. We introduce reorganization operations for maintaining such an ontology in the data quality management cycle.


Prioritize the ordering of URL queue in Focused crawler

The enormous growth of the World Wide Web in recent years has made it necessary to perform resource discovery efficiently. For a crawler, downloading domain-specific web pages is not a simple task. An unfocused approach often yields undesired results. Therefore, several new ideas have been proposed; among them, a key technique is focused crawling, which is able to crawl particular topical...


Publication date: 2006